Updating messaging data structures to include predicted attribute values associated with recipient entities

ABSTRACT

This disclosure involves modifying messaging data having unknown attribute values associated with entities to facilitate retrieval of address data for communications with the entities. For example, a system accesses a mapping of first addresses to an attribute, wherein the first addresses include (1) a target address for a target entity and (2) addresses associated with first entities in turn associated with first known values of the attribute. The system accesses a mapping of second addresses to an attribute, wherein the second addresses include (1) the target address for the target entity and (2) addresses associated with second entities in turn associated with second known values of the attribute. The system determines distributions of the first known values and the second known values, predicts a value of the attribute for the target entity based thereon, updates the messaging data therewith, and services a query for addresses having the predicted value.

RELATED APPLICATIONS

This disclosure claims priority to U.S. Provisional Application No. 62/315,143, entitled “Predicting User Attributes Based on Electronic Communications Involving Users,” filed Mar. 30, 2016, the entirety of which is hereby incorporated by reference herein.

FIELD OF THE INVENTION

This disclosure relates generally to computer-implemented methods and systems for managing the content of a messaging data structure to facilitate the retrieval of information used for communication via a data network, and more particularly relates to updating messaging data structures to include predicted attribute values associated with recipient entities and thereby facilitating retrieval of address data for electronic communications with the recipient entities.

BACKGROUND

Messaging data structures, such as databases, store information that is used for communication of electronic messages via a data network. A messaging data structure can include a database or other data structure that is used to store data samples with values of different attributes used in communicating electronic messages. For example, electronic messages, such as e-mails and text messages, can be used by vendors and other senders to induce various recipient entities (e.g., customers and other users) to access online content. A communication system is used by vendors and other senders to perform these communications. The communication system uses a messaging data structure to manage these communications (e.g., by selecting certain groups of recipient entities to which electronic messages will be transmitted).

However, messaging data structures may include sub-optimal information for selecting recipient groups. For example, a messaging data structure may lack attribute information for certain recipient entities (e.g., missing attribute values for age, gender, geographic location, and other attributes). Thus, a communication system is unable to retrieve an accurate listing of appropriate recipients for a given set of electronic messages.

SUMMARY

This disclosure involves modifying messaging data structures having unknown attribute values associated with recipient entities to facilitate retrieval of address data for electronic communications with the recipient entities. For example, a system accesses a first portion of a messaging data structure storing data identifying a first mapping among an online electronic content service, first electronic addresses subscribed to the online electronic content service, and an entity attribute, wherein the first electronic addresses include (i) a target electronic address for a target recipient entity, the target electronic address having a local part and a domain part and (ii) a first plurality of electronic addresses associated with first member recipient entities, wherein the first member recipient entities are respectively associated with first known values of the entity attribute in the first portion of the messaging data structure. The system also accesses a second portion of the messaging data structure storing data identifying a second mapping of second electronic addresses, a common domain part identified in the second electronic addresses, and the entity attribute, wherein the second electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a second plurality of electronic addresses associated with second member recipient entities, wherein the second member recipient entities are respectively associated with second known values of the entity attribute in the second portion of the messaging data structure. The system subsequently determines a first distribution of the first known values of the entity attribute accessed from the first portion of the messaging data structure and a second distribution of the second known values of the entity attribute accessed from the second portion of the messaging data structure. The system computes a predicted value of the entity attribute for the target recipient entity based on the first distribution and the second distribution, updates the messaging data structure with the predicted value, and services a query for electronic addresses having the predicted value by retrieving data describing the target recipient entity from the messaging data structure.

These and other aspects, features and advantages of the present invention may be more clearly understood and appreciated from a review of the following detailed description and by reference to the appended drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts an example of a system including a server system that executes a message management application for predicting user attributes based on electronic communications involving users, according to certain aspects of the present disclosure.

FIG. 2 depicts an example of a messaging data structure within a recipient database, according to certain aspects of the present disclosure.

FIG. 3 depicts an example of merging entity attribute data associated with various electronic addresses subscribed to various online electronic content services, according to certain aspects of the present disclosure.

FIG. 4 depicts an example of merging entity attribute data associated with various electronic addresses subscribed to various online electronic content services, generating distributions of the entity attribute data, and feeding back the distributions for predicting unknown entity attribute data, according to certain aspects of the present disclosure.

FIG. 5 depicts an example of modifying messaging data structures having unknown attribute values associated with recipient entities to facilitate retrieval of address data for electronic communications with the recipient entities, according to certain aspects of the present disclosure.

FIG. 6 depicts examples of distributions of gender and age data for online electronic content services to which an electronic address subscribes, according to certain aspects of the present disclosure.

FIG. 7 depicts examples of age distributions for different online electronic content services, according to certain aspects of the present disclosure.

FIG. 8 depicts examples of gender distributions for different online electronic content services, according to certain aspects of the present disclosure.

FIG. 9 depicts an example of using Bayesian inference to predict a gender for a given electronic address, according to certain aspects of the present disclosure.

FIG. 10 depicts additional details of the example of using Bayesian inference to predict a gender for a given electronic address, according to certain aspects of the present disclosure.

FIG. 11 depicts additional details of the example of using Bayesian inference to predict a gender for a given electronic address, according to certain aspects of the present disclosure.

FIG. 12 depicts additional details of the example of using Bayesian inference to predict a gender for a given electronic address, according to certain aspects of the present disclosure.

FIG. 13 depicts additional details of the example of using Bayesian inference to predict a gender for a given electronic address, according to certain aspects of the present disclosure.

FIG. 14 depicts an example of a server system that executes a message management application for optimizing the effectiveness of different electronic message versions, according to certain aspects of the present disclosure.

DETAILED DESCRIPTION

Improved systems and techniques are disclosed for predicting an unknown value of an entity attribute based on electronic communications involving a target recipient entity and member recipient entities. For example, electronic messages such as e-mails may be transmitted to a large pool of electronic addresses subscribed to an online electronic content service. The electronic addresses can correspond to a target recipient entity and member recipient entities. The target recipient entity may be associated with an unknown value of the entity attribute, such as gender or age, and the member recipient entities may be associated with known values of the entity attribute. A message management application executed by a computing system can analyze the known values of the entity attribute to generate a prediction of the unknown value of the entity attribute. The known values of the entity attribute can be associated with member recipient entities sharing a common characteristic with the target recipient entity. Examples of such a common characteristic include a subscription to the same online electronic content service, the same electronic address domain, and/or the same or similar first name.

In a simplified example, a user associated with an electronic address such as “joe.snuffy@domainX.xyz” may subscribe to a first online electronic content service and a second online electronic content service. The message management application can access entity attribute data describing ages, genders, or other attributes for at least some of the electronic addresses subscribed to the first online electronic content service and the second online electronic content service. The message management application can generate a first age distribution for known ages of member recipient entities subscribed to the first online electronic content service (e.g., ages 30-40) and a second age distribution for known ages of member recipient entities subscribed to the second online electronic content service (e.g., ages 35-45).

Based on the overlap between the two age distributions associated with the two online electronic content services, both of which include the electronic address “joe.snuffy@domainX.xyz,” the message management application can determine that a predicted age for the user having the electronic address “joe.snuffy@domainX.xyz” is between 35 and 40 years. The message management application can update a messaging data structure, which stores information for the electronic address “joe.snuffy@domainX.xyz,” to include the predicted attribute value for this age attribute. Thus, subsequent queries to the messaging data structure for electronic addresses associated with an attribute value of “35-40” will return the electronic address “joe.snuffy@domainX.xyz.”
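The overlap described in this simplified example can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each age distribution is summarized as a closed range of whole years; the function name `overlap` and the variable names are hypothetical and do not appear in the disclosure.

```python
from typing import Optional, Tuple

AgeRange = Tuple[int, int]  # (lowest age, highest age), both inclusive


def overlap(first: AgeRange, second: AgeRange) -> Optional[AgeRange]:
    """Return the intersection of two age ranges, or None if they do not overlap."""
    low = max(first[0], second[0])
    high = min(first[1], second[1])
    return (low, high) if low <= high else None


# Ranges taken from the simplified example in the text.
first_service_ages = (30, 40)
second_service_ages = (35, 45)

predicted_age_range = overlap(first_service_ages, second_service_ages)
print(predicted_age_range)
```

A real distribution would carry a count or probability for each age rather than a single range, so this sketch only shows the intersection step of the example.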

Referring now to the drawings, FIG. 1 is a block diagram depicting an example of a system including a server system 102 that executes a message management application 104 for predicting user attributes based on electronic communications involving users. The message management application 104 can be used to generate, modify, select, or otherwise use one or more electronic messages 112 to be transmitted via a data network 130 (e.g., e-mails, multimedia messages that can be delivered to smart phones, push notification dialogs, web pages, etc.). The message management application 104 can also be used to analyze and predict attribute data for users to whom various electronic messages 112 are transmitted.

The server system 102 can communicate with one or more vendor systems 132 and one or more recipient devices 136 via one or more signals communicated via one or more data networks 130. The server system 102 can include one or more processing devices. In some embodiments, the server system 102 can be a single server. In other embodiments, the server system 102 can include multiple computing systems that are configured for distributed computing (e.g., grid-based computing, cloud computing, etc.).

The server system 102 can include or have access to one or more non-transitory computer-readable media on which program code and electronic data are stored. The program code includes a message management application 104. The electronic data includes one or more electronic messages 112.

The message management application 104 is executable by a processing device to perform one or more operations for predicting an unknown value of an entity attribute based on data associated with the transmission of electronic messages 112. An electronic message 112 can include electronic data having interactive content, such as clickable images or other clickable content. The interactive content is used by clients to access online content 142 hosted on a web server 140 or other server. For example, the message management application 104 can configure the server system 102 to define a campaign, a marketing program, an advertising plan, or other operation involving the transmission of electronic messages via one or more data networks 130.

The message management application 104 can include one or more suitable software modules. In the example depicted in FIG. 1, the message management application 104 includes a user analytics module 106, a message editing module 108, and an address management module 110. The user analytics module 106 can be used to predict an unknown value of an entity attribute based on electronic messages transmitted to various users. Predicting an unknown value of an entity attribute can include, for example, determining an estimated age range for a user associated with an electronic address, determining a predicted gender for a user associated with an electronic address, determining an estimated geographic location for a user associated with an electronic address, or generating any other prediction of an attribute describing a user associated with an address to which electronic messages 112 can be provided.

The message editing module 108 can provide tools that enable a user to create and edit user content. For example, a vendor application 134 executed at a vendor system 132 can access the message editing module 108 via a data network 130 to create one or more electronic messages for transmission to recipient devices 136. In some embodiments, the message editing module 108 may provide tools that enable a user to create and edit e-mail messages such as may be used in e-mail campaigns. An e-mail campaign is used herein to refer to the process of sending an e-mail (generally the same e-mail) to a particular group of people.

In some embodiments, one or more of the user analytics module 106 and the message editing module 108 can communicate with an e-mail server 144. The e-mail server 144 can prepare and send e-mails or other electronic messages in a campaign to users using electronic addresses stored in address lists of a recipient database 126. Addresses in the recipient database 126 may be entered and organized using tools provided by the address management module 110. In additional or alternative embodiments, a separate e-mail server 144 can be omitted. For example, one or more of the user analytics module 106 and the message editing module 108 can communicate with an e-mail service or other suitable software executed on the server system 102 and can thereby configure the server system 102 to transmit e-mails or other electronic messages.

FIG. 2 is a block diagram depicting an example of a messaging data structure 202 within the recipient database 126. For example, the messaging data structure 202 can map an electronic address 210 to an online electronic content service 208 (e.g., based on the electronic address 210 being subscribed to the online electronic content service 208), a first name 212, and an entity attribute 214. Examples of the online electronic content service 208 can be the fictional World News Weekly and Developers Daily. Examples of the entity attribute 214 can be age and gender. The electronic addresses 210 that are subscribed to World News Weekly can include an electronic address joe.snuffy@domainX.xyz associated with the target recipient entity 204 and electronic addresses for some of the member recipient entities 206. Similarly, the electronic addresses 210 that are subscribed to Developers Daily can include the electronic address joe.snuffy@domainX.xyz for the target recipient entity 204 and electronic addresses for others of the member recipient entities 206. The electronic addresses 210 can be mapped to a first name 212. The electronic address joe.snuffy@domainX.xyz for the target recipient entity 204 can be mapped to an unknown value of the entity attribute 214. Electronic addresses 210 for member recipient entities 206 can be mapped to known values of the entity attribute 214. As described herein, the user analytics module 106 can predict an unknown value of the entity attribute 214 associated with the target recipient entity 204 by determining a distribution of known values of the entity attribute 214 associated with member recipient entities 206 subscribed to the same online electronic content service 208 as the target recipient entity 204, member recipient entities 206 having a same or similar first name 212 as the target recipient entity 204, and/or member recipient entities 206 with electronic addresses 210 on the same domain as the electronic address for the target recipient entity 204.
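As a rough illustration of the mapping described for FIG. 2, the following Python sketch models the messaging data structure as a list of flat records. The class name `RecipientRecord`, its field names, the helper `co_subscribers`, and the two member addresses are hypothetical choices made for this sketch; the disclosure does not prescribe a schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecipientRecord:
    """One row of a hypothetical messaging data structure (field names are assumptions)."""
    electronic_address: str
    content_service: str
    first_name: str
    gender: Optional[str] = None  # None marks an unknown attribute value


records: List[RecipientRecord] = [
    # Target recipient entity: subscribed to both services, gender unknown.
    RecipientRecord("joe.snuffy@domainX.xyz", "World News Weekly", "Joe"),
    RecipientRecord("joe.snuffy@domainX.xyz", "Developers Daily", "Joe"),
    # Member recipient entities with known values (addresses invented for illustration).
    RecipientRecord("member.one@domainX.xyz", "World News Weekly", "Ann", "female"),
    RecipientRecord("member.two@domainY.xyz", "Developers Daily", "Bob", "male"),
]


def co_subscribers(address: str, rows: List[RecipientRecord]) -> List[RecipientRecord]:
    """Return member records with known values on any service the address subscribes to."""
    services = {r.content_service for r in rows if r.electronic_address == address}
    return [
        r for r in rows
        if r.content_service in services
        and r.electronic_address != address
        and r.gender is not None
    ]


members = co_subscribers("joe.snuffy@domainX.xyz", records)
```

A production system would more likely hold these mappings in database tables; the flat list is used here only to keep the example self-contained.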

Referring back to FIG. 1, a vendor system 132 can include any computing device or group of computing devices that can access the message management application 104 to generate, modify, or otherwise use one or more electronic messages 112. In some embodiments, a vendor system 132 transmits one or more of the electronic messages 112 to the server system 102 (e.g., via e-mail, via an upload interface presented in a web browser executed at a vendor system 132, etc.). In additional or alternative embodiments, a vendor system 132 remotely accesses the message management application 104 and uses the message management application 104 to generate one or more of the electronic messages 112 (e.g., via a design interface or a data entry interface presented in a web browser executed at a vendor system 132).

The vendor system 132 depicted in FIG. 1 includes one or more processing devices for executing one or more vendor applications 134. A vendor application 134 includes program code that can be executed at the vendor system 132 for transmitting, creating, editing, modifying, or otherwise using one or more electronic messages 112. For example, a vendor application 134 may be used to communicate with the message management application 104 and to thereby generate and send online messages that are associated with a marketing campaign. In some embodiments, a vendor application 134 can be a web browser application or other suitable application that is installed on a non-transitory computer-readable medium accessible to a vendor system 132 and that can be used to remotely access one or more features of the message management application 104. In additional or alternative embodiments, a vendor application 134 can be a dedicated application installed on a non-transitory computer-readable medium that is included in or accessible to a vendor system 132.

The recipient device 136 depicted in FIG. 1 can be any computing device that accesses one or more other computing systems via the data network 130. Non-limiting examples of recipient devices 136 include smart phones, tablet computers, laptop computers, etc. Each recipient device 136 executes one or more user applications 138. A user application 138 is any application suitable for receiving and interacting with electronic messages 112 to which the server system 102 provides access. Non-limiting examples of user applications 138 include web browser applications, e-mail applications, etc.

The web server 140 depicted in FIG. 1 can be any server, computing device, or combination of computing devices that provides access to online content 142 (e.g., webpages) that is accessible via one or more other data networks 130 (e.g., the Internet). Online content 142 may include a website for purchasing products or services that are described or depicted in electronic messages 112. Electronic messages transmitted to user devices can include links to the online content 142 hosted by one or more web servers 140.

For illustrative purposes, the server system 102, the vendor system 132, the web server 140, and the e-mail server 144 are depicted as separate systems. However, other implementations are possible. For example, a server system 102 may perform one or more of executing the message management application 104, executing the vendor application 134, and executing one or more web services that provide access to the online content 142 via the Internet.

The user analytics module 106 can be executed by the server system 102 to predict an unknown value of one or more entity attributes 214 of one or more target recipient entities with electronic addresses 210 in the recipient database 126. The analysis can be performed using known values of the one or more entity attributes 214 associated with member recipient entities 206 sharing a common characteristic with the target recipient entity 204, such as a subscription to the same online electronic content service 208, the same electronic address domain, and/or the same or similar first name 212.

FIGS. 3 and 4 depict an example of merging known values of entity attributes 214 associated with various electronic addresses 210 subscribed to various online electronic content services 208. The message management application 104 can receive, from one or more vendor applications 134, one or more data sets describing various member recipients who will receive various electronic messages. A first data set 302 can include a first electronic message to be sent to at least two users (“User₁” and “User₂”) at electronic addresses (“Email₁” and “Email₂”) on a given address list (“List₁”). A second data set 304 can include a second electronic message to be sent to the two users (“User₁” and “User₂”) at the electronic addresses (“Email₁” and “Email₂”) on the address list (“List₂”). The first and second data sets 302 and 304 can also include information such as device preferences (“Device pref”), geographic locations (“Geoloc”), and other entity attributes 214. The message management application 104 can consolidate information from different data sets into a merged data set 306 in the recipient database 126. For example, a first electronic address (“Email₁”) can be associated with various other entity attribute data (subscriptions, device preferences, etc.) in the recipient database 126, and a second electronic address (“Email₂”) can be associated with various other entity attribute data (subscriptions, device preferences, etc.) in the recipient database 126.
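The consolidation into a merged data set can be sketched in Python as follows. The dictionary layout, the function name `merge_data_sets`, the placeholder addresses, and the attribute values are all assumptions made for illustration. The sketch also assumes that, when two data sets disagree about an attribute, the later data set wins; the disclosure does not specify a conflict rule.

```python
from typing import Any, Dict, List


def merge_data_sets(data_sets: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Consolidate per-list recipient rows into one entry per electronic address."""
    merged: Dict[str, Dict[str, Any]] = {}
    for data_set in data_sets:
        for row in data_set["recipients"]:
            entry = merged.setdefault(row["email"], {"lists": [], "attributes": {}})
            if data_set["list"] not in entry["lists"]:
                entry["lists"].append(data_set["list"])
            # Assumed conflict rule: later data sets overwrite earlier values.
            entry["attributes"].update(row.get("attributes", {}))
    return merged


# Hypothetical stand-ins for data sets 302 and 304.
data_set_302 = {"list": "List1", "recipients": [
    {"email": "email1@example.test", "attributes": {"device_pref": "mobile"}},
    {"email": "email2@example.test", "attributes": {"geoloc": "US"}},
]}
data_set_304 = {"list": "List2", "recipients": [
    {"email": "email1@example.test", "attributes": {"geoloc": "CA"}},
    {"email": "email2@example.test", "attributes": {}},
]}

merged_306 = merge_data_sets([data_set_302, data_set_304])
```

After the merge, each address carries every list it appears on together with the union of its known attributes, which is the input needed for the distributions described next.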

The entity attribute data in the merged data set 306 in the recipient database 126 can be used to generate distributions 402 of known values of various entity attributes 214. For example, the user analytics module 106 or other suitable program code can be executed to generate a distribution of gender data for one or more domains, a distribution of age data for one or more domains, a distribution of gender data for one or more first names 212, a distribution of age data for one or more first names 212, a distribution of gender data for one or more online electronic content services 208 (e.g., address lists for a given set of e-mail content), a distribution of age data for one or more online electronic content services 208, or any other suitable distribution of data.
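One way such distributions could be computed is to group the merged records by a chosen characteristic and count the known values within each group. The Python sketch below assumes records are plain dictionaries; the function name `distributions_by`, the keys, and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable


def distributions_by(
    records: Iterable[Dict[str, Any]], group_key: str, attribute: str
) -> Dict[Any, Counter]:
    """Count known attribute values per group, skipping records whose value is unknown."""
    grouped: Dict[Any, Counter] = defaultdict(Counter)
    for record in records:
        value = record.get(attribute)
        if value is not None:
            grouped[record[group_key]][value] += 1
    return dict(grouped)


# Invented sample rows; None marks an unknown gender value.
sample = [
    {"domain": "domainX.xyz", "first_name": "Ann", "gender": "female"},
    {"domain": "domainX.xyz", "first_name": "Bob", "gender": "male"},
    {"domain": "domainX.xyz", "first_name": "Joe", "gender": None},
    {"domain": "domainY.xyz", "first_name": "Bob", "gender": "male"},
]

by_domain = distributions_by(sample, "domain", "gender")
by_name = distributions_by(sample, "first_name", "gender")
```

The same function covers the domain, first name, and content service cases named above by changing `group_key`, and covers age by changing `attribute`.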

The user analytics module 106 can use these distributions to generate predictions of values of various entity attributes 214 that are unknown to the message management application 104. An unknown value of an entity attribute 214 can be predicted using one or more operations described herein. For instance, FIG. 5 depicts an example of a process 500, which may be performed by the message management application 104 or another suitable computing system, that generates a prediction of an unknown value of an entity attribute 214, according to certain embodiments. In some embodiments, one or more processing devices implement operations depicted in FIG. 5 by executing suitable program code (e.g., the user analytics module 106). For illustrative purposes, the process 500 is described with reference to certain examples depicted in the figures. Other implementations, however, are possible.

At block 502, the process 500 involves accessing a first mapping between electronic addresses 210, which are subscribed to an online electronic content service 208, and known values of an entity attribute 214 (e.g., an age attribute, a gender attribute, etc.). The electronic addresses 210 include the electronic address for a target recipient entity 204 without a known value of the entity attribute 214 and electronic addresses 210 for member recipient entities 206 with known values of the entity attribute 214.

A processing device (e.g., one or more processors of the server system 102) can execute one or more modules of the message management application 104 (or suitable other program code) to implement block 502. For example, the program code for the message management application 104, which is stored in a non-transitory computer-readable medium, is executed by one or more processing devices. Executing the message management application 104 causes the processing device to access mapping data from the messaging data structure 202. The accessed mapping data from the messaging data structure 202 can be stored in the same non-transitory computer-readable medium or a different non-transitory computer-readable medium. In some embodiments, accessing the mapping data involves communicating, via a data bus, suitable signals between a local non-transitory computer-readable medium and the processing device. In additional or alternative embodiments, accessing the mapping data involves communicating, via a data network, suitable signals between a computing system that includes the non-transitory computer-readable medium and a computing system that includes the processing device.

In one example, the target recipient entity 204 associated with the electronic address 210 “joe.snuffy@domainX.xyz” may be subscribed to the fictional online electronic content service 208 called “World News Weekly.” The gender attribute value associated with this target recipient entity 204 may be unknown. The user analytics module 106 can access known gender attribute values associated with other subscribers to “World News Weekly,” or member recipient entities 206, for use in predicting the unknown gender attribute value associated with the target recipient entity 204.

At block 504, the process 500 involves accessing a second mapping of electronic addresses 210, which have a common domain part, with known values of an entity attribute 214. The electronic addresses 210 include the electronic address for a target recipient entity 204 without a known value of the entity attribute 214. The electronic addresses 210 also include electronic addresses 210 for member recipient entities 206 with known values of the entity attribute 214.

A processing device (e.g., one or more processors of the server system 102) can execute the user analytics module 106 or one or more other modules of the message management application 104 (or suitable other program code) to implement block 504. For example, the program code for the message management application 104, which is stored in a non-transitory computer-readable medium, is executed by one or more processing devices. Executing the message management application 104 causes the processing device to access mapping data from the messaging data structure 202. The accessed mapping data from the messaging data structure 202 can be stored in the same non-transitory computer-readable medium or a different non-transitory computer-readable medium. In some embodiments, accessing the mapping data involves communicating, via a data bus, suitable signals between a local non-transitory computer-readable medium and the processing device. In additional or alternative embodiments, accessing the mapping data involves communicating, via a data network, suitable signals between a computing system that includes the non-transitory computer-readable medium and a computing system that includes the processing device.

Continuing with the example above, the electronic address 210 “joe.snuffy@domainX.xyz” associated with the target recipient entity 204 has a domain part “domainX.xyz.” The user analytics module 106 can access known gender attribute values associated with other electronic addresses 210 having the same domain part for use in predicting the unknown gender attribute value associated with the target recipient entity 204.
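Selecting addresses that share a domain part first requires splitting each address into its local part and domain part. The Python sketch below shows one simple way to do so, splitting at the last “@” and comparing domains case-insensitively. The function names `split_address` and `same_domain_members` and the member addresses are hypothetical, and the sketch does not attempt full address validation.

```python
from typing import Dict, Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split an address at its last '@' into (local part, lower-cased domain part)."""
    local_part, _, domain_part = address.rpartition("@")
    if not local_part or not domain_part:
        raise ValueError(f"not a usable electronic address: {address!r}")
    return local_part, domain_part.lower()


def same_domain_members(target: str, known_values: Dict[str, str]) -> Dict[str, str]:
    """Return known attribute values for other addresses on the target's domain."""
    _, target_domain = split_address(target)
    return {
        address: value
        for address, value in known_values.items()
        if address != target and split_address(address)[1] == target_domain
    }


# Invented member addresses with known gender values.
known_genders = {
    "member.one@domainX.xyz": "male",
    "member.two@DOMAINX.xyz": "female",
    "member.three@domainY.xyz": "male",
}

domain_members = same_domain_members("joe.snuffy@domainX.xyz", known_genders)
```

The values returned for the shared domain are what the second distribution in block 506 would then be computed from.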

At block 506, the process 500 involves determining a first distribution of the known values of the entity attribute 214 accessed from the first mapping in block 502 and a second distribution of the known values of the entity attribute 214 accessed from the second mapping in block 504.

A processing device (e.g., one or more processors of the server system 102) executes one or more modules of the message management application 104 (or suitable other program code) to implement block 506. For example, the program code for the message management application 104, which is stored in a non-transitory computer-readable medium, is executed by one or more processing devices. Executing the message management application 104 causes the processing device to perform one or more operations that implement the determination of block 506.

Continuing with the example above, the user analytics module 106 can determine that, among the member recipient entities 206 that are subscribed to “World News Weekly” with a known gender attribute value, four of the member recipient entities 206 are associated with “male” attribute values and two of the member recipient entities 206 are associated with “female” attribute values. The user analytics module 106 can use this distribution in predicting the unknown gender attribute value associated with the target recipient entity 204. Continuing with the example above, the user analytics module 106 can also determine that, among the member recipient entities 206 that are associated with an electronic address 210 having the same domain part “domainX.xyz” as the electronic address 210 associated with the target recipient entity 204, three are associated with a male gender attribute value and one is associated with a female gender attribute value. The user analytics module 106 can also use this distribution in predicting the unknown gender attribute value associated with the target recipient entity 204.

At block 508, the process 500 involves computing a predicted value of the unknown entity attribute 214 associated with the target recipient entity 204 based on the first distribution and the second distribution, both determined in block 506.

A processing device (e.g., one or more processors of the server system 102) executes one or more modules of the message management application 104 (or suitable other program code) to implement block 508. For example, the program code for the message management application 104, which is stored in a non-transitory computer-readable medium, is executed by one or more processing devices. Executing the message management application 104 causes the processing device to perform one or more operations that implement the computation of block 508.

Continuing with the example above, the user analytics module 106 can use the four-male-two-female first distribution and the three-male-one-female second distribution to predict that the unknown gender attribute value associated with the target recipient entity 204 is male.
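For illustration only, the following Python sketch shows one simple way the two distributions in this example could be combined into a single prediction. It assumes that the distributions are independent and equally informative, which this disclosure does not require, and the function name combine_distributions is hypothetical.

```python
def combine_distributions(distributions):
    """Multiply normalized distributions value by value, then renormalize.

    Assumption (not taken from the disclosure): the distributions are
    independent and equally weighted, and each has at least one known value.
    """
    values = set().union(*distributions)
    scores = {value: 1.0 for value in values}
    for counts in distributions:
        total = sum(counts.values())
        for value in values:
            scores[value] *= counts.get(value, 0) / total
    norm = sum(scores.values())
    if norm == 0:
        # No value is supported by every distribution; fall back to uniform.
        return {value: 1.0 / len(values) for value in values}
    return {value: score / norm for value, score in scores.items()}


# Counts from the example: subscribers to "World News Weekly" and
# addresses sharing the domain part "domain.xyz".
first_distribution = {"male": 4, "female": 2}
second_distribution = {"male": 3, "female": 1}

combined = combine_distributions([first_distribution, second_distribution])
predicted_gender = max(combined, key=combined.get)
```

Under these assumptions the combined score favors “male,” which agrees with the prediction described in the example above.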

At block 510, the process 500 involves updating the messaging datastructure 202 with the predicted value of the entity attribute 214associated with the target recipient entity 204 computed from the firstand second distributions. For example, the message managementapplication 104 (including any suitable module thereof) can configurethe server system 102 or another suitable computing system to implementblock 510. The server system 102 can access a non-transitorycomputer-readable medium in which the messaging data structure 202 isstored and thereby retrieve some or all of the data from the messagingdata structure 202. The server system 102 can access a portion of thedata in the messaging data structure 202 that describes the targetrecipient entity (e.g., one or more records for the target recipiententity). The server system 102 can modify the accessed portion of thedata in the messaging data structure 202 to include the predicted valueof the entity attribute 214. The server system 102 can store the updatedmessaging data structure 202 in the non-transitory computer-readablemedium.

At block 512, the process 500 involves servicing a query for electronicaddresses 210 having the predicted value of the entity attribute 214 byretrieving data describing the target entity. For example, the messagemanagement application 104 (including any suitable module thereof) canconfigure the server system 102 or another suitable computing system toimplement block 512. The server system 102 can communicate with one ormore vendor systems 132 via a data network 130. These communications caninclude, for example, one or more queries from one or more vendorsystems 132. The server system 102 can respond to a received query byaccessing a non-transitory computer-readable medium in which themessaging data structure 202 is stored. The server system 102 canservice the query by retrieving data from the messaging data structure202 that matches or otherwise corresponds to one or more searchparameters in a received query. The server system 102 can generate andtransmit, via the data network 130, a response to one or more vendorsystem 132. The response can include the data that the server system 102retrieved as a result of servicing the query.
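The update of block 510 and the query of block 512 can be sketched, under simplifying assumptions, with an in-memory list of records standing in for the messaging data structure 202. The record layout, the member addresses, and the function names below are hypothetical and are not taken from this disclosure.

```python
# Hypothetical in-memory stand-in for the messaging data structure 202.
records = [
    {"address": "joe.snuffy@domainX.xyz", "gender": None},
    {"address": "member.one@domainX.xyz", "gender": "male"},
    {"address": "member.two@domainX.xyz", "gender": "female"},
]


def update_predicted_value(records, address, attribute, value):
    """Store a predicted attribute value on the record for one address."""
    for record in records:
        if record["address"] == address:
            record[attribute] = value
            # Flag the value as predicted rather than known.
            record[attribute + "_predicted"] = True
            return record
    raise KeyError(address)


def query_addresses(records, attribute, value):
    """Return the addresses whose attribute matches the query value."""
    return [r["address"] for r in records if r.get(attribute) == value]


update_predicted_value(records, "joe.snuffy@domainX.xyz", "gender", "male")
matches = query_addresses(records, "gender", "male")
```

After the update, a query for addresses with the “male” value returns the target address along with the member address that already had that known value.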

In some embodiments, the message management application 104 can perform one or more additional operations, such as accessing a third mapping of (1) electronic addresses 210 associated with a common first name and (2) known values of an entity attribute 214, wherein the electronic addresses 210 include (a) the electronic address for a target recipient entity 204 without a known value of the entity attribute 214 and (b) electronic addresses 210 for member recipient entities 206 with known values of the entity attribute 214. In this example, the electronic address 210 “joe.snuffy@domainX.xyz” is associated with the first name “Joe.” The user analytics module 106 can access known gender attribute values associated with other electronic addresses 210 associated with the first name “Joe” for use in predicting the unknown gender attribute value associated with the target recipient entity 204.

In these embodiments, block 506 can involve determining a third distribution of the known values of the entity attribute 214 accessed in the third mapping. Continuing with the example provided above, the user analytics module 106 can determine that, among the other electronic addresses 210 associated with the first name “Joe,” all are associated with the male gender attribute value. The user analytics module 106 can use this distribution in predicting the unknown gender attribute value associated with the target recipient entity 204. Additionally or alternatively, block 508 can involve computing a predicted value of the unknown entity attribute 214 associated with the target recipient entity 204 based on the first distribution, the second distribution, and the third distribution. Continuing with the example provided above, the user analytics module 106 can use the four-male-two-female first distribution, the three-male-one-female second distribution, and the all-male third distribution to predict that the unknown gender attribute value associated with the target recipient entity 204 is male. In such embodiments, at block 510, the process 500 involves updating the messaging data structure 202 with the predicted value of the entity attribute 214 associated with the target recipient entity 204 computed from the first, second, and third distributions.

In some embodiments, the message management application 104 can perform one or more additional operations, such as applying weights to the first distribution and the second distribution by logistic regression modeling to generate a weighted first distribution and a weighted second distribution. For example, the user analytics module 106 can be trained to give more predictive weight to a distribution of known values of an entity attribute 214 determined by accessing member recipient entities 206 subscribed to the same online electronic content service 208 than to a distribution of known values of an entity attribute 214 determined by accessing member recipient entities 206 associated with electronic addresses 210 having the same domain part. The user analytics module 106 can be trained to assign certain predictive weight to certain distributions of known values of an entity attribute 214 using any suitable software machine learning library. One example is the scikit-learn software machine learning library for the Python programming language. In such embodiments, at block 508, the process 500 involves computing the predicted value of the entity attribute 214 for the target recipient entity 204 based on the weighted first distribution and the weighted second distribution. In such embodiments, at block 510, the process 500 involves updating the messaging data structure 202 with the predicted value of the entity attribute 214 computed from the weighted first distribution and the weighted second distribution.
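As a rough sketch of how learned weights might be applied, the following Python code scores the two example distributions with a logistic model whose inputs are the log-odds of each distribution. The weights and intercept are hypothetical placeholders; in practice they would be fitted to labeled data (for example, with a logistic regression implementation from a machine learning library). The smoothing constant is likewise an assumption, added to avoid division by zero.

```python
import math


def log_odds(counts, positive="male", negative="female", smoothing=0.5):
    """Smoothed log-odds of the positive value within one distribution."""
    return math.log((counts.get(positive, 0) + smoothing) /
                    (counts.get(negative, 0) + smoothing))


def weighted_probability(distributions, weights, intercept=0.0):
    """Logistic model over per-distribution log-odds features."""
    z = intercept + sum(w * log_odds(d) for w, d in zip(weights, distributions))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights: the subscription distribution counts for more
# than the domain-part distribution.
weights = [1.2, 0.8]
subscription_distribution = {"male": 4, "female": 2}
domain_distribution = {"male": 3, "female": 1}

p_male = weighted_probability(
    [subscription_distribution, domain_distribution], weights)
predicted_gender = "male" if p_male >= 0.5 else "female"
```

A larger weight makes the corresponding distribution move the prediction more strongly, which mirrors the idea of giving the subscription-based distribution more predictive weight than the domain-based one.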

In some embodiments, the message management application 104 can perform one or more additional operations, such as determining at least one of the first three-character sequence and the last three-character sequence of the local part of the electronic address for the target recipient entity. The user analytics module 106 can be trained to associate certain character sequences occurring in the local part of an e-mail address (e.g., the “jon.jones1980” part of the e-mail address “jon.jones1980@domainX.xyz”) with a certain age and/or gender attribute value. The user analytics module 106 can be trained to associate certain character sequences with a certain age and/or gender attribute value using any suitable software machine learning library. One example is the scikit-learn software machine learning library for the Python programming language. For example, the user analytics module 106 can determine that the first three-character sequence (“trigram”) of the e-mail address “jon.jones1980@domainX.xyz” is “jon” and that the last trigram is “980.” The trained machine-learning algorithm can then determine that the target recipient entity 204 associated with that e-mail address is likely a male (based on the “jon” trigram) and was likely born in the year 1980 (based on the “980” trigram). In such embodiments, at block 508, the process 500 involves computing a predicted value of the unknown entity attribute 214 associated with the target recipient entity 204 based on the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the electronic address 210 for the target recipient entity 204. In such embodiments, at block 510, the process 500 involves updating the messaging data structure 202 with the predicted value of the entity attribute 214 associated with the target recipient entity 204 computed from the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the electronic address 210 for the target recipient entity 204.
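The trigram extraction described above can be sketched in a few lines of Python. The function name local_part_trigrams is hypothetical, and lowercasing the local part is an assumption made so that differently cased addresses yield the same features.

```python
def local_part_trigrams(address):
    """Return the first and last three characters of the local part."""
    local_part, _, _domain = address.partition("@")
    local_part = local_part.lower()  # assumption: case-insensitive features
    return local_part[:3], local_part[-3:]


first_trigram, last_trigram = local_part_trigrams("jon.jones1980@domainX.xyz")
```

For the example address, the first trigram is “jon” and the last trigram is “980.” These strings could then be supplied as categorical features to a trained model.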

In some embodiments, the message management application 102 can performone or more additional operations, such as determining a confidencelevel associated with the predicted value of the entity attribute 214based on whether the target recipient entity 204 has interacted with anelectronic message 112. For example, the user analytics module 106 candetermine that a given target recipient entity 204 is likely to be inthe age range of 35-45 years. The message management application 104 cansubsequently cause an electronic message 112 to be provided to theelectronic address 210 associated with the target recipient entity 204.The electronic message 112 can describe a product or service that istypically used by consumers in the age range of 40-50. If the messagemanagement application 104 determines that the target recipient entity204 with the predicted age range of 35-45 years has interacted with theelectronic message 112, which describes a product or service that istypically used by consumers in the age range of 40-50, the interactioncan provide further data indicating that the target recipient entity 204is within the age range of 35-45 years. If the message managementapplication 104 determines that the target recipient entity 204 with thepredicted age range of 35-45 years has not interacted with theelectronic message 112 in a certain way (e.g., clicking a product link),the absence of interaction can be data indicating that the targetrecipient entity 204 may not be within the age range of 35-45 years. Insuch embodiments, at block 510, the process 500 involves updating themessaging data structure 202 with the determined confidence levelassociated with the predicted value of the entity attribute 214.

In some embodiments, the message management application 104 can perform one or more additional operations, such as computing the predicted value of the entity attribute 214 for the target recipient entity 204 by applying a Bayesian inference algorithm to the first distribution and the second distribution. An example of such a computation is described herein with respect to FIGS. 9-13.

Turning to FIG. 6, the recipient database 126 may include an electronic address “joe.snuffy@domainX.xyz.” The recipient database 126 may lack a known value of an entity attribute 214 such as age or gender for this electronic address.

To generate estimates or predictions for this missing data, the user analytics module 106 can use online electronic content service 208 subscriptions of the electronic address “joe.snuffy@domainX.xyz.” For example, the user analytics module 106 can access the recipient database 126 or another suitable data structure to identify which online electronic content services 208 include the electronic address “joe.snuffy@domainX.xyz.”

The user analytics module 106 can also identify other member recipient entities 206 subscribed to the identified online electronic content services 208. The other member recipient entities 206 can include known values of the entity attribute 214. The user analytics module 106 can determine that other member recipient entities 206 subscribed to a given online electronic content service 208 have certain gender attribute values (e.g., male) and age attribute values (e.g., “age 33,” “age 36”).

The user analytics module 106 can use the known values of the entity attribute 214 to generate a distribution of the known values of the entity attribute 214. In the example depicted in FIG. 6, the user analytics module 106 generates or otherwise determines a distribution of the known values of the age entity attribute for each online electronic content service 208 to which the electronic address “joe.snuffy@domainX.xyz” subscribes. An example of an age distribution for different online electronic content services 208 is depicted in FIG. 7. The user analytics module 106 also generates or otherwise determines a distribution of the known values of the gender entity attribute for each online electronic content service 208 to which the electronic address “joe.snuffy@domainX.xyz” subscribes. An example of a gender attribute value distribution for a different online electronic content service 208 is depicted in FIG. 8.

For a given entity attribute 214, the user analytics module 106 can determine a likely attribute value based on a combination of known entity attribute value distributions. In a simplified example, the user analytics module 106 can determine that the electronic address “joe.snuffy@domainX.xyz” is subscribed to a first online electronic content service 208 for which the recipient ages are concentrated in the range of 30-40. The user analytics module 106 can also determine that the electronic address “joe.snuffy@domainX.xyz” is subscribed to a second online electronic content service 208 for which the recipient ages are concentrated in the range of 35-50. The user analytics module 106 can determine that the overlap between these age ranges is likely to include the age of the target recipient entity 204 with the electronic address “joe.snuffy@domainX.xyz.” For example, based on these distributions, the user analytics module 106 can generate an estimated age range of 35-40 for the target recipient entity 204 associated with the electronic address “joe.snuffy@domainX.xyz.” Similarly, the user analytics module 106 can determine that if the distribution of the known values of the gender entity data for these online electronic content services 208 is heavily skewed toward males, then the target recipient entity 204 associated with the electronic address “joe.snuffy@domainX.xyz” is likely a male.
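The age-range overlap in this simplified example can be expressed as a short Python sketch. The function name overlapping_range is hypothetical, and representing each distribution by a single (low, high) range is a simplification of the distributions described above.

```python
def overlapping_range(ranges):
    """Return the (low, high) range shared by all inputs, or None."""
    low = max(r[0] for r in ranges)
    high = min(r[1] for r in ranges)
    return (low, high) if low <= high else None


first_service_ages = (30, 40)   # first online electronic content service
second_service_ages = (35, 50)  # second online electronic content service

estimated_age_range = overlapping_range(
    [first_service_ages, second_service_ages])
```

For the ranges in the example, the shared range is 35-40. When the ranges do not overlap, the sketch returns None, and some other technique would be needed.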

In predicting an unknown value of the age entity attribute, the user analytics module 106 can optionally determine a distribution of known values of the age entity attribute as percentages of member recipient entities having a known age attribute value in various predetermined age ranges. The user analytics module 106 can determine multiple such distributions, for example, one for each online electronic content service 208 to which the target recipient entity 204 is subscribed. The user analytics module 106 can average these multiple distributions into a single distribution for use as an input to one or more suitable automated modeling algorithms executed by the message management application 104 to compute predicted attribute values.
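The averaging step can be sketched as follows. The age buckets and percentages are hypothetical values chosen for illustration, as is the function name average_distributions; an unweighted mean is assumed, although a weighted mean (for example, by subscriber count) would be equally plausible.

```python
def average_distributions(distributions):
    """Average percentage distributions bucket by bucket (unweighted mean)."""
    buckets = sorted(set().union(*distributions))
    count = len(distributions)
    return {bucket: sum(d.get(bucket, 0.0) for d in distributions) / count
            for bucket in buckets}


# Hypothetical percentages of subscribers with known ages, per age range.
service_a = {"18-29": 20.0, "30-39": 50.0, "40-49": 30.0}
service_b = {"18-29": 10.0, "30-39": 30.0, "40-49": 60.0}

averaged = average_distributions([service_a, service_b])
```

The resulting single distribution has one percentage per age range and can serve as a fixed-length input to a modeling algorithm, regardless of how many services the target recipient entity is subscribed to.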

The user analytics module 106 can also use different known entityattribute value distributions in combination with one another to predictor otherwise determine an unknown value of an entity attribute 214. Forexample, the user analytics module 106 can determine that memberrecipient entities 206 subscribed to a first online electronic contentservice 208 are heavily concentrated among women of ages 20-25 and malesof ages 35-40. The user analytics module 106 can also determine thatmember recipient entities 206 subscribed to a second online electroniccontent service 208 are heavily concentrated among persons of ages35-50. The user analytics module 106 can thereby predict that if thetarget recipient entity 204 associated with the electronic address“joe.snuffy@domainX.xyz” is subscribed to both of these onlineelectronic content services 208, he is likely in the age range of 35-40(based on the overlap in age ranges) and is likely to be a male (basedon subscribers to the first online electronic content service 208 withinthe 35-40 age range typically being males).

Any suitable entity attribute 214 can be used or predicted by the useranalytics module 106. Examples of suitable entity attributes 214 at theindividual level include (but are not limited to) first name, last name,title, gender or inferred gender, address (country, state, city, zipcode) and general location information, birthdate or inferred birthyear/age, birthday, company, username, online electronic content service208 subscriptions, and geolocation. Examples of suitable entityattributes at the list level include (but are not limited to) genderdistribution and age distribution. An application programming interface(“API”) can be implemented to query known and predicted attributesassociated with an electronic address 210. An API can also beimplemented to query distributions of attribute data associated with alist.

Although the simplified examples described herein involve relatively fewmember recipient entities 206, accurate predictions of entity attributedata may involve large volumes of data that require analysis viasuitable computing systems. For example, the recipient database 126 maylack entity attribute data for large numbers (e.g., thousands) of memberrecipient entities 206, may lack reliable data for member recipiententities 206 (e.g., due to spammers providing false user data to themessage management application 104), or may otherwise include gaps indata that would be used to predict entity attribute data. A sufficientlylarge pool of addresses must therefore be used to minimize the impact ofthese gaps in entity attribute data or incorrect entity attribute datawhen building distributions of entity attribute data (e.g., agedistributions, gender distributions, etc.). For example, entityattribute data for over one million member recipient entities 206 may beneeded to minimize the impact of having missing data or false data forseveral thousand member recipient entities 206. The volume of datarequired to generate entity attribute distributions that accuratelyreflect the subscribers to certain types of online electronic contentservices 208 can require the use of a computing architecture capable ofprocessing these large data sets.

The message management application 104 can utilize any suitablearchitecture for storing and analyzing large volumes of entity attributedata. One example of such an implementation is Elasticsearch for storingand organizing user data (e.g., in the recipient database 126) and aBayesian inference modeling technique for generating and analyzingdistributions of entity attribute data. For example, six Elasticsearchnodes can be used to store, aggregate, and cache over ten billionrecords, making use of linear algebra with the NumPy Python package.Also for example, Elasticsearch can be used by operations and deliveryfor logging, for horizontal scaling, for allowing faster access tosubscriber data, and for aggregating across common variables key toanalytical models. In additional or alternative embodiments, othersuitable storage architectures, other predictive modeling techniques, orsome combination thereof may be used.

One or more suitable automated modeling algorithms can be executed by the message management application 104 to compute predicted attribute values. An automated modeling algorithm (e.g., an algorithm using logistic regression, Bayesian inference, neural networks, etc.) can learn or otherwise identify relationships between known attributes and unknown attributes. An automated modeling algorithm is trained using large volumes of training data. This training data, which can be generated by online interactions with one or more of electronic messages 112 or online content 142, is analyzed by one or more computing devices (e.g., a server system 102). The training data is grouped into attributes, which are provided as inputs to the automated modeling algorithm. The automated modeling algorithm analyzes these attributes to learn from and make predictions regarding data obtained from online transactions. For example, the automated modeling algorithm uses the attributes to learn how to predict a certain unknown attribute value (e.g., age, gender, etc.) based on a context involving other attribute values (e.g., subscription, domain names, n-grams or other tokenized data derived from electronic addresses, etc.) similar to attributes from the training data (e.g., a certain combination of subscription and domain attribute values indicating a high likelihood of a “male” attribute value). This training and predicting can be accomplished using any suitable software machine learning library. One example is the scikit-learn software machine learning library for the Python programming language.

FIGS. 9-13 depict an example of using Bayesian inference to predict a gender for a given target recipient entity 204. The example provided in FIGS. 9-13 is provided for illustrative purposes. In additional or alternative embodiments, other suitable predictive modeling techniques in addition to or other than Bayesian inference may be used, such as logistic regression, classification and regression tree, random forests, gradient tree boosting, etc.

In this example, the user analytics module 106 identifies an electronic address 210 “cass@domainX.xyz” in the recipient database 126, as depicted in FIG. 9. The user analytics module 106 determines that the electronic address 210 includes a domain (i.e., “domainX.xyz”), as depicted in FIG. 10. For member recipient entities 206 having gender data for that domain, 52% are determined to be “male” users and 48% are determined to be “female” users, as depicted in FIG. 10.

In this example, the user analytics module 106 also determines that the electronic address 210 is associated with a first name 212 (i.e., “Cass”), as depicted in FIG. 11. The user analytics module 106 determines that the first name “Cass” is associated with female member recipient entities 206 more frequently than with male member recipient entities 206, as depicted in FIG. 11. The user analytics module 106 can make this determination based on, for example, all available member recipient entities 206 (whether or not associated with the “domainX.xyz” domain) having both a known gender and the name “Cass” or a derivative of “Cass” (e.g., “Cassie,” “Cassandra,” “Castor,” etc.).

In this example, the user analytics module 106 also determines that the electronic address 210 is subscribed to the “Developers Daily” online electronic content service 208, as depicted in FIG. 12. The user analytics module 106 determines, based on analyzing member recipient entities 206 associated with the online electronic content service 208 and having known gender data, that the majority of member recipient entities 206 subscribed to the “Developers Daily” online electronic content service 208 are male, as depicted in FIG. 12. The user analytics module 106 can also determine that other online electronic content services 208 to which the electronic address “cass@domainX.xyz” is subscribed are skewed toward the male gender, as depicted in FIG. 13.

The user analytics module 106 can predict a gender associated with the electronic address “cass@domainX.xyz” based on a Bayesian inference algorithm or other suitable predictive modeling techniques. For example, as depicted in FIG. 13, the user analytics module 106 predicts that the gender for the target recipient entity 204 is “male” based on the combination of distributions depicted in FIG. 13, which (considered together) are more indicative of a male target recipient entity 204 than a female target recipient entity 204.
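One way to read this example is as a naive Bayes combination, sketched below in Python. Only the 52%/48% domain distribution is taken from the example; the first-name and subscription percentages, the uniform base rate, and the function name naive_bayes_posterior are hypothetical, and the conditional-independence assumption is a simplification that this disclosure does not mandate.

```python
def naive_bayes_posterior(base_rate, conditionals):
    """Combine P(value | feature) distributions under naive Bayes.

    posterior(v) is proportional to
    base_rate(v) * product over features of P(v | feature) / base_rate(v).
    """
    scores = {}
    for value, prior in base_rate.items():
        score = prior
        for conditional in conditionals:
            score *= conditional[value] / prior
        scores[value] = score
    norm = sum(scores.values())
    return {value: score / norm for value, score in scores.items()}


base_rate = {"male": 0.50, "female": 0.50}             # assumed uniform
given_domain = {"male": 0.52, "female": 0.48}          # from the example
given_first_name = {"male": 0.40, "female": 0.60}      # hypothetical
given_subscription = {"male": 0.85, "female": 0.15}    # hypothetical

posterior = naive_bayes_posterior(
    base_rate, [given_domain, given_first_name, given_subscription])
predicted_gender = max(posterior, key=posterior.get)
```

With these illustrative numbers, the strongly male-skewed subscription distribution outweighs the female-leaning first-name distribution, so the combined evidence points to “male,” consistent with the outcome described for FIG. 13.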

In some embodiments, different weights can be applied to different types of entity attribute distributions when predicting an entity attribute value for a given target recipient entity 204. In the example depicted in FIGS. 9-13, a first name 212 of a target recipient entity 204 may be more indicative of the entity's age than a geographic location is. Thus, if the user analytics module 106 is predicting the age of a target recipient entity 204, a distribution of age data for a given first name 212 associated with an electronic address 210 may be given a greater weight than a distribution of age data for a given geographic region associated with the electronic address 210.

The user analytics module 106 or other suitable program module candetermine weights for different types of distributions using suitablemodel training. For example, a supervised machine-learning algorithm(e.g., a neural network) can be trained to associate certain names withcertain demographic information (e.g., age, gender, etc.). The trainingcan be performed by providing a data set with verified data to themachine-learning algorithm. The data set can be verified for suitablevariance before being provided to the machine-learning algorithm, toavoid use of a data set having near-zero variance for example. Thetrained machine-learning algorithm can be used to determine thelikelihood of one entity attribute value (e.g., the name “Cass”) beingassociated with another entity attribute value (e.g., the gender“female”). The user analytics module 106 or other suitable programmodule can use the likelihood to apply appropriate weights to differententity attribute distributions when predicting a certain entityattribute value.

In the same manner, the supervised machine-learning algorithm (e.g., aneural network) can also be trained to associate certain charactersequences occurring in the local part of an e-mail address (e.g., the“jon.jones1980” part of the e-mail address “jon.jones1980@domainX.xyz”)with a certain age and/or gender. For example, the user analytics module106 can determine that the first three-character sequence (“trigram”) ofthe e-mail address “jon.jones1980@domainX.xyz” is “jon” and that thelast trigram is “980.” The trained machine-learning algorithm can thendetermine that the target recipient entity 204 associated with thatemail address is likely a male (based on the “jon” trigram) and waslikely born in the year 1980 (based on the “980” trigram).

In some embodiments, the message management application 104 can useresponsive electronic data generated by interactions with electronicmessages 112 to assign or modify a confidence level associated with apredicted entity attribute value. For example, the user analytics module106 can determine that a given target recipient entity 204 is likely tobe in the age range of 35-45 years. The message management application104 can subsequently cause an electronic message 112 to be provided tothe electronic address 210 associated with the target recipient entity204. The electronic message 112 can describe a product or service thatis typically used by consumers in the age range of 40-50. If the messagemanagement application 104 determines that the target recipient entity204 with the predicted age range of 35-45 years has interacted with theelectronic message 112, which describes a product or service that istypically used by consumers in the age range of 40-50, the interactioncan provide further data indicating that the target recipient entity 204is within the age range of 35-45 years. If the message managementapplication 104 determines that the target recipient entity 204 with thepredicted age range of 35-45 years has not interacted with theelectronic message 112 in a certain way (e.g., clicking a product link),the absence of interaction can be data indicating that the targetrecipient entity 204 may not be within the age range of 35-45 years.
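A confidence adjustment of this kind can be sketched as a Bayesian update, shown below in Python. The interaction probabilities are hypothetical placeholders (this disclosure does not specify how a confidence level is computed), and the function name update_confidence is likewise an assumption.

```python
def update_confidence(confidence, interacted,
                      p_interact_if_correct=0.30,
                      p_interact_if_incorrect=0.10):
    """Revise the confidence that a predicted value is correct.

    The two interaction probabilities are hypothetical; they express how
    likely an interaction is when the prediction is right versus wrong.
    """
    if interacted:
        like_correct = p_interact_if_correct
        like_incorrect = p_interact_if_incorrect
    else:
        like_correct = 1.0 - p_interact_if_correct
        like_incorrect = 1.0 - p_interact_if_incorrect
    numerator = confidence * like_correct
    return numerator / (numerator + (1.0 - confidence) * like_incorrect)


initial_confidence = 0.60  # hypothetical confidence in the 35-45 age range
after_interaction = update_confidence(initial_confidence, interacted=True)
after_no_interaction = update_confidence(initial_confidence, interacted=False)
```

An interaction raises the confidence and the absence of an interaction lowers it, with the size of each change governed by the assumed probabilities.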

Assigning or modifying a confidence level can also involve receivingresponsive electronic data that is automatically generated byinteractions with electronic messages 112. For example, the messagemanagement application 104 can be executed by a suitable processingdevice to perform one or more operations suitable for assigning ormodifying a confidence level, including for example receiving responsiveelectronic data that indicates how the target recipient entity 204interacted with an electronic message 112 (e.g., opening the electronicmessages 112, clicking on links in the electronic messages 112, etc.).The responsive electronic data can be any data that is automaticallygenerated or provided to the message management application 104 as aresult of the target recipient entity 204 interacting with theelectronic message 112.

The message management application 104 can receive any type of responsive electronic data as a result of a recipient device 136 associated with a target recipient entity 204 interacting with an electronic message 112. The responsive electronic data can be generated in any suitable manner. In some embodiments, an electronic message 112 can include program code that causes a notification to be transmitted from a recipient device 136 to the server system 102 in response to the electronic message 112 being opened at the recipient device 136. The notification can be transmitted to the server system 102 without notifying a viewer of the electronic message 112 at the recipient device 136.

In other embodiments, the responsive electronic data can include datathat is provided to the message management application 104 as a resultof the recipient device 136 accessing online content 142 via anelectronic message 112. For example, a link to the online content 142that is included in an electronic message 112 may include a URLparameter that causes the web server 140 to notify the server system 102that a link has been clicked. A non-limiting example of the parameter isan alphanumeric string that provides an identifier for a campaigninvolving the transmission of the electronic messages 112. The webserver 140 can use the identifier included in the URL parameter touniquely identify a visit to the website. The web server 140 can respondto receiving the URL parameter by notifying the server system 102 that arecipient device 136 to which an electronic message 112 was transmittedaccessed the online content 142 during a certain time period.
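The URL parameter described above can be illustrated with Python's standard urllib.parse module. The parameter name “cid,” the example URL, and the campaign identifier are hypothetical; this disclosure states only that the parameter can be an alphanumeric string identifying a campaign.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def add_campaign_parameter(url, campaign_id, name="cid"):
    """Return the URL with a campaign identifier added to its query string."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query[name] = [campaign_id]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


def extract_campaign_id(url, name="cid"):
    """Return the campaign identifier from a clicked link, if present."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


link = add_campaign_parameter(
    "https://example.com/offer?ref=mail", "spring2016a")
campaign_id = extract_campaign_id(link)
```

In this sketch, the first function corresponds to building the link placed in an electronic message 112, and the second corresponds to what a web server could do on receiving the request before notifying the server system 102.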

Example of a System Implementation

Any suitable computing system or group of computing systems can be usedto implement the server system 102. For example, FIG. 14 is a blockdiagram depicting an example of a server system 102 that executes amessage management application for optimizing the effectiveness ofdifferent electronic message versions.

The server system 102 can include a processor 802 that iscommunicatively coupled to a memory 804. The processor 802 performs oneor more of executing computer-executable program code stored in thememory 804 and accessing information stored in the memory 804. Whenexecuted by the processor 802, instructions stored in the memory 804cause the processor 802 to perform one or more operations describedherein. The processor 802 may include a microprocessor, anapplication-specific integrated circuit (“ASIC”), or other processingdevice. The processor 802 can include any of a number of processingdevices, including one.

The memory 804 can include any suitable computer-readable medium. Thecomputer-readable medium can include any electronic, optical, magnetic,or other storage device capable of providing a processor withcomputer-readable instructions or other program code. Non-limitingexamples of a computer-readable medium include a CD-ROM, DVD, magneticdisk, memory chip, ROM, RAM, an ASIC, optical storage, magnetic tape orother magnetic storage, or any other medium from which a computerprocessor can read program code. The program code may includeprocessor-specific instructions generated by one or more of a compilerand an interpreter from code written in any suitablecomputer-programming language, including, for example, C, C++, C#,Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.

The server system 102 may also include a number of external or internaldevices such as input or output devices. For example, the server system102 is shown with an input/output (“I/O”) interface 808 that can receiveinput from input devices or provide output to output devices. A bus 806can also be included in the server system 102. The bus 806 cancommunicatively couple one or more components of the server system 102.

The server system 102 can execute program code that configures the processor 802 to perform one or more of the operations described above with respect to FIGS. 1-13. The program code can include, for example, the message management application 104. The program code may be resident in the memory 804 or any suitable computer-readable medium and may be executed by the processor 802 or any other suitable processor. In some embodiments, the electronic messages 112 and associated data can be resident in the memory 804, as depicted in FIG. 14. In other embodiments, one or more of the electronic messages 112 and other associated data can be resident in a memory that is accessible via a data network, such as a memory accessible via a cloud service or other data network service.

The server system 102 can also include at least one network interface 810. The network interface 810 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 130. Non-limiting examples of the network interface 810 include an Ethernet network adapter, a modem, and any other suitable communication device. The server system 102 can communicate with one or more vendor systems 132, one or more recipient devices 136, or both using the network interface 810.

General Considerations

Numerous specific details are set forth herein to provide a thorough understanding of the claimed subject matter. However, those skilled in the art will understand that the claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses, or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.

Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.

The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provides a result conditioned on one or more inputs. Suitable computing devices include multipurpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.

Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied. For example, blocks can be re-ordered, combined, broken into sub-blocks, or some combination thereof. Certain blocks or processes can be performed in parallel.

The use of “adapted to” or “configured to” herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.

While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure has been presented for purposes of example rather than limitation, and does not preclude the inclusion of such modifications, variations, and additions to the present subject matter as would be readily apparent to one of ordinary skill in the art.

1. A method for modifying messaging data structures having unknown attribute values associated with recipient entities to facilitate retrieval of address data for electronic communications with the recipient entities, the method comprising: accessing, by a processing device, a first portion of a messaging data structure storing data identifying a first mapping among an online electronic content service, first electronic addresses subscribed to the online electronic content service, and an entity attribute, wherein the first electronic addresses include (i) a target electronic address for a target recipient entity, the target electronic address having a local part and a domain part and (ii) a first plurality of electronic addresses associated with first member recipient entities, wherein the first member recipient entities are respectively associated with first known values of the entity attribute in the first portion of the messaging data structure; accessing, by the processing device, a second portion of the messaging data structure storing data identifying a second mapping of second electronic addresses, a common domain part identified in the second electronic addresses, and the entity attribute, wherein the second electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a second plurality of electronic addresses associated with second member recipient entities, wherein the second member recipient entities are respectively associated with second known values of the entity attribute in the second portion of the messaging data structure; determining, by the processing device, a first distribution of the first known values of the entity attribute accessed from the first portion of the messaging data structure and a second distribution of the second known values of the entity attribute accessed from the second portion of the messaging data structure; computing, by the processing device, a predicted value of the entity attribute for the target recipient entity based on the first distribution and the second distribution; updating, by the processing device, the messaging data structure with the predicted value of the entity attribute computed from the first distribution and the second distribution; and servicing, by the processing device, a query for electronic addresses having the predicted value of the entity attribute by retrieving data describing the target recipient entity from the messaging data structure.
2. The method of claim 1, further comprising: accessing, by the processing device, a third portion of the messaging data structure storing data identifying a third mapping of third electronic addresses, a common first name associated with the third electronic addresses, and the entity attribute, wherein the third electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a third plurality of electronic addresses associated with third member recipient entities, wherein the third member recipient entities are respectively associated with third known values of the entity attribute in the third portion of the messaging data structure; wherein the processing device further determines a third distribution of the third known values of the entity attribute accessed from the third portion of the messaging data structure; wherein the processing device computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and the third distribution; and wherein the processing device updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and the third distribution.
3. The method of claim 1, wherein the entity attribute associated with the target recipient entity is age or gender.
4. The method of claim 1, further comprising: applying weights, by the processing device, to the first distribution and the second distribution by logistic regression modeling to generate a weighted first distribution and a weighted second distribution; wherein the processing device computes the predicted value of the entity attribute for the target recipient entity based on the weighted first distribution and the weighted second distribution; and wherein the processing device updates the messaging data structure with the predicted value of the entity attribute computed from the weighted first distribution and the weighted second distribution.
5. The method of claim 1, further comprising: determining, by the processing device, at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address for the target recipient entity; and wherein the processing device computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address; and wherein the processing device updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address.
6. The method of claim 1, further comprising: determining, by the processing device, a confidence level associated with the predicted value of the entity attribute based on whether the target recipient entity has interacted with an electronic message; and updating, by the processing device, the messaging data structure with the confidence level associated with the predicted value of the entity attribute.
7. The method of claim 1, wherein the processing device computes the predicted value of the entity attribute for the target recipient entity by Bayesian inferencing the first distribution and the second distribution.
8. A system comprising: a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured for executing program code stored in the non-transitory computer-readable medium to perform operations comprising: accessing a first portion of a messaging data structure storing data identifying a first mapping among an online electronic content service, first electronic addresses subscribed to the online electronic content service, and an entity attribute, wherein the first electronic addresses include (i) a target electronic address for a target recipient entity, the target electronic address having a local part and a domain part and (ii) a first plurality of electronic addresses associated with first member recipient entities, wherein the first member recipient entities are respectively associated with first known values of the entity attribute in the first portion of the messaging data structure; accessing a second portion of the messaging data structure storing data identifying a second mapping of second electronic addresses, a common domain part identified in the second electronic addresses, and the entity attribute, wherein the second electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a second plurality of electronic addresses associated with second member recipient entities, wherein the second member recipient entities are respectively associated with second known values of the entity attribute in the second portion of the messaging data structure; determining a first distribution of the first known values of the entity attribute accessed from the first portion of the messaging data structure and a second distribution of the second known values of the entity attribute accessed from the second portion of the messaging data structure; computing a predicted value of the entity attribute for the target recipient entity based on the first distribution and the second distribution; updating the messaging data structure with the predicted value of the entity attribute computed from the first distribution and the second distribution; and servicing a query for electronic addresses having the predicted value of the entity attribute by retrieving data describing the target recipient entity from the messaging data structure.
9. The system of claim 8, the operations further comprising: accessing a third portion of the messaging data structure storing data identifying a third mapping of third electronic addresses, a common first name associated with the third electronic addresses, and the entity attribute, wherein the third electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a third plurality of electronic addresses associated with third member recipient entities, wherein the third member recipient entities are respectively associated with third known values of the entity attribute in the third portion of the messaging data structure; wherein the determining operation further determines a third distribution of the third known values of the entity attribute accessed from the third portion of the messaging data structure; wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and the third distribution; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and the third distribution.
10. The system of claim 8, wherein the entity attribute associated with the target recipient entity is age or gender.
11. The system of claim 8, the operations further comprising: applying weights to the first distribution and the second distribution by logistic regression modeling to generate a weighted first distribution and a weighted second distribution; wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the weighted first distribution and the weighted second distribution; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the weighted first distribution and the weighted second distribution.
12. The system of claim 8, the operations further comprising: determining at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address for the target recipient entity; and wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address.
13. The system of claim 8, the operations further comprising: determining a confidence level associated with the predicted value of the entity attribute based on whether the target recipient entity has interacted with an electronic message; and updating the messaging data structure with the confidence level associated with the predicted value of the entity attribute.
14. The system of claim 8, wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity by Bayesian inferencing the first distribution and the second distribution.
15. A non-transitory computer-readable medium having instructions stored thereon, the instructions executable by a processing device to perform operations comprising: accessing a first portion of a messaging data structure storing data identifying a first mapping among an online electronic content service, first electronic addresses subscribed to the online electronic content service, and an entity attribute, wherein the first electronic addresses include (i) a target electronic address for a target recipient entity, the target electronic address having a local part and a domain part and (ii) a first plurality of electronic addresses associated with first member recipient entities, wherein the first member recipient entities are respectively associated with first known values of the entity attribute in the first portion of the messaging data structure; accessing a second portion of the messaging data structure storing data identifying a second mapping of second electronic addresses, a common domain part identified in the second electronic addresses, and the entity attribute, wherein the second electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a second plurality of electronic addresses associated with second member recipient entities, wherein the second member recipient entities are respectively associated with second known values of the entity attribute in the second portion of the messaging data structure; determining a first distribution of the first known values of the entity attribute accessed from the first portion of the messaging data structure and a second distribution of the second known values of the entity attribute accessed from the second portion of the messaging data structure; computing a predicted value of the entity attribute for the target recipient entity based on the first distribution and the second distribution; updating the messaging data structure with the predicted value of the entity attribute computed from the first distribution and the second distribution; and servicing a query for electronic addresses having the predicted value of the entity attribute by retrieving data describing the target recipient entity from the messaging data structure.
16. The non-transitory computer-readable medium of claim 15, the operations further comprising: accessing a third portion of the messaging data structure storing data identifying a third mapping of third electronic addresses, a common first name associated with the third electronic addresses, and the entity attribute, wherein the third electronic addresses include (i) the target electronic address for the target recipient entity and (ii) a third plurality of electronic addresses associated with third member recipient entities, wherein the third member recipient entities are respectively associated with third known values of the entity attribute in the third portion of the messaging data structure; wherein the determining operation further determines a third distribution of the third known values of the entity attribute accessed from the third portion of the messaging data structure; wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and the third distribution; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and the third distribution.
17. The non-transitory computer-readable medium of claim 15, wherein the entity attribute associated with the target recipient entity is age or gender.
18. The non-transitory computer-readable medium of claim 15, the operations further comprising: applying weights to the first distribution and the second distribution by logistic regression modeling to generate a weighted first distribution and a weighted second distribution; wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the weighted first distribution and the weighted second distribution; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the weighted first distribution and the weighted second distribution.
19. The non-transitory computer-readable medium of claim 15, the operations further comprising: determining at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address for the target recipient entity; and wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity based on the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address; and wherein the updating operation updates the messaging data structure with the predicted value of the entity attribute computed from the first distribution, the second distribution, and at least one of the first three-character sequence and the last three-character sequence of the local part of the target electronic address.
20. The non-transitory computer-readable medium of claim 15, the operations further comprising: determining a confidence level associated with the predicted value of the entity attribute based on whether the target recipient entity has interacted with an electronic message; and updating the messaging data structure with the confidence level associated with the predicted value of the entity attribute.
21. The non-transitory computer-readable medium of claim 15, wherein the computing operation computes the predicted value of the entity attribute for the target recipient entity by Bayesian inferencing the first distribution and the second distribution.
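
By way of non-limiting illustration only, the operations recited above (determining two distributions of known attribute values, computing a predicted value, updating the messaging data, and servicing a query) might be sketched in Python as follows. Everything in this sketch is an assumption made for illustration and is not taken from the disclosure: the in-memory dictionary `MESSAGING_DATA`, the field names `service` and `gender`, the sample addresses, the helper names, and in particular the combination rule, which multiplies the two distributions and normalizes them (a naive-Bayes-style combination with a uniform prior). The disclosure does not specify that rule; other combinations, such as the weighted distributions described with logistic regression, are equally possible.

```python
from collections import Counter

# Hypothetical messaging data: address -> subscribed service and known
# attribute value (None where the value is unknown).
MESSAGING_DATA = {
    "ann@example.com": {"service": "newsletter_a", "gender": "F"},
    "bob@example.com": {"service": "newsletter_a", "gender": "M"},
    "cat@other.org": {"service": "newsletter_a", "gender": "F"},
    "dee@other.org": {"service": "newsletter_a", "gender": "F"},
    "eve@example.com": {"service": "newsletter_b", "gender": "F"},
    "target@example.com": {"service": "newsletter_a", "gender": None},
}


def domain_part(address):
    """Return the domain part of an electronic address."""
    return address.rpartition("@")[2]


def local_part_sequences(address):
    """Return the first and last three characters of the local part."""
    local = address.rpartition("@")[0]
    return local[:3], local[-3:]


def distribution(values):
    """Relative frequency of each known attribute value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def known_values(data, target, key_fn):
    """Known values of members that share key_fn(...) with the target."""
    target_key = key_fn(target, data[target])
    return [
        record["gender"]
        for address, record in data.items()
        if address != target
        and record["gender"] is not None
        and key_fn(address, record) == target_key
    ]


def predict(first, second, floor=1e-6):
    """Multiply the two distributions and normalize (assumed rule).

    A value missing from one distribution gets the small `floor`
    probability so that the product does not collapse to zero.
    """
    values = set(first) | set(second)
    if not values:
        return None, 0.0
    scores = {v: first.get(v, floor) * second.get(v, floor) for v in values}
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


target = "target@example.com"
# First distribution: members subscribed to the same content service.
first = distribution(known_values(MESSAGING_DATA, target, lambda a, r: r["service"]))
# Second distribution: members whose addresses share the target's domain part.
second = distribution(known_values(MESSAGING_DATA, target, lambda a, r: domain_part(a)))
predicted, probability = predict(first, second)

# Update the messaging data with the predicted value, then service a query.
MESSAGING_DATA[target]["gender"] = predicted
MESSAGING_DATA[target]["predicted"] = True
matches = sorted(a for a, r in MESSAGING_DATA.items() if r["gender"] == predicted)
```

In this sketch the helper `local_part_sequences` only extracts the first and last three-character sequences of the local part; how such sequences would contribute to the prediction is left open here, since the passage above does not specify it.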